A Collapsing Method for the Efficient Recovery of Optimal Edges in Phylogenetic Trees

Authors

  • Michael Hu
  • Paul E. Kearney
  • Jonathan H. Badger
Abstract

As sequencing efforts and the volume of genomic data continue to grow at an accelerating rate, phylogenetic analysis provides an evolutionary context for understanding and interpreting this growing body of complex data. We introduce a novel quartet-based method for inferring molecular phylogenies, called hypercleaning* (HC*). HC* is based on the hypercleaning (HC) technique, which has the useful property of recovering the edges of a phylogenetic tree that are best supported by the witness quartet set. HC* extends HC in two respects: (i) whereas HC constrains the input quartet set to be unweighted (binary valued), HC* allows any positive-valued quartet scores, enabling more informative quartets to be defined; (ii) HC* employs a novel collapsing technique that significantly speeds up the inference stage, making it empirically on par with quartet puzzling in speed while still guaranteeing optimal edge recovery as in HC. This paper is primarily aimed at presenting the algorithmic construction of HC*. We also report preliminary studies on an implementation of HC* as a potentially powerful approximation scheme for maximum-likelihood-based inference. Details of the proofs can be found in the report at: (www.michaelhu.com/reports/ mmath thesis.pdf).
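To make the notion of an edge "supported by the witness quartet set" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual quartet-method convention that an edge splitting the taxa into sides A and B is witnessed by the quartets ab|cd with a, b in A and c, d in B, and that each quartet carries a non-negative score. The names `canon` and `edge_support` and the dictionary layout of the scores are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def canon(a, b, c, d):
    """Order-independent key for the quartet topology ab|cd (hypothetical helper)."""
    return frozenset((frozenset((a, b)), frozenset((c, d))))


def edge_support(side_a, side_b, weights):
    """Return (total score, number of witness quartets) for the split side_a | side_b.

    Assumption: a witness quartet of the split is any ab|cd with a, b taken
    from side_a and c, d taken from side_b. Quartets absent from `weights`
    contribute a score of 0.
    """
    total = 0.0
    count = 0
    for a, b in combinations(sorted(side_a), 2):
        for c, d in combinations(sorted(side_b), 2):
            total += weights.get(canon(a, b, c, d), 0.0)
            count += 1
    return total, count


# Illustrative scores on five taxa (made-up values, not data from the paper).
example_weights = {
    canon("A", "B", "C", "D"): 0.5,
    canon("A", "B", "C", "E"): 0.25,
    canon("A", "C", "B", "D"): 1.0,  # conflicts with the split AB | CDE
}

support, n_witnesses = edge_support({"A", "B"}, {"C", "D", "E"}, example_weights)
```

With binary scores this reduces to counting how many witness quartets are present in the input set, which corresponds to the unweighted setting the abstract attributes to HC; allowing arbitrary positive scores corresponds to the weighted setting described for HC*.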

Similar resources

A Collapsing Method for Efficient Recovery of Optimal Edges in Phylogenetic Trees

In this thesis we present a novel algorithm, HyperCleaning∗, for effectively inferring phylogenetic trees. The method is based on the quartet method paradigm and is guaranteed to recover the best-supported edges of the underlying phylogeny based on the witness quartet set. This is performed efficiently using a collapsing mechanism that employs a memory/time tradeoff to ensure no loss of informatio...

Quantitative Comparison of Tree Pairs Resulted from Gene and Protein Phylogenetic Trees for Sulfite Reductase Flavoprotein Alpha-Component and 5S rRNA and Taxonomic Trees in Selected Bacterial Species

Introduction: FAD is the cofactor of FAD-FR protein family. Sulfite reductase flavoprotein alpha-component is one of the main enzymes of this family. Based on applications of this enzyme in biotechnology and industry, it was chosen as the subject of evolutionary studies in 19 specific species. Method: Gene and protein sequences of sulfite reductase flavoprotein alpha-component, 5S rRNA sequence...

Journal:
  • International Journal on Artificial Intelligence Tools

Volume 14

Publication date: 2003